HSC: A Novel Method for Clustering Hierarchies of Networked Data
نویسنده
چکیده
Hierarchical clustering is one of the most powerful solutions to the problem of clustering, on the grounds that it performs a multi scale organization of the data. In recent years, research on hierarchical clustering methods has attracted considerable interest due to the demanding modern application domains. We present a novel divisive hierarchical clustering framework called Hierarchical Stochastic Clustering (HSC), that acts in two stages. In the first stage, it finds a primary hierarchy of clustering partitions in a dataset. In the second stage, feeds a clustering algorithm with each one of the clusters of the very detailed partition, in order to settle the final result. The output is a hierarchy of clusters. Our method is based on the previous research of Meyer and Weissel Stochastic Data Clustering and the theory of Simon and Ando on Variable Aggregation. Our experiments show that our framework builds a meaningful hierarchy of clusters and benefits consistently the clustering algorithm that acts in the second stage, not only computationally but also in terms of cluster quality. This result suggest that HSC framework is ideal for obtaining hierarchical solutions of large volumes of data.
منابع مشابه
Application of modified balanced iterative reducing and clustering using hierarchies algorithm in parceling of brain performance using fMRI data
Introduction: Clustering of human brain is a very useful tool for diagnosis, treatment, and tracking of brain tumors. There are several methods in this category in order to do this. In this study, modified balanced iterative reducing and clustering using hierarchies (m-BIRCH) was introduced for brain activation clustering. This algorithm has an appropriate speed and good scalability in dealing ...
متن کاملNeural-Smith Predictor Method for Improvement of Networked Control Systems
Networked control systems (NCSs) are distributed control systems in which the nodes, including controllers, sensors, actuators, and plants are connected by a digital communication network such as the Internet. One of the most critical challenges in networked control systems is the stochastic time delay of arriving data packets in the communication network among the nodes. Using the Smith predic...
متن کاملNGTSOM: A Novel Data Clustering Algorithm Based on Game Theoretic and Self- Organizing Map
Identifying clusters is an important aspect of data analysis. This paper proposes a noveldata clustering algorithm to increase the clustering accuracy. A novel game theoretic self-organizingmap (NGTSOM ) and neural gas (NG) are used in combination with Competitive Hebbian Learning(CHL) to improve the quality of the map and provide a better vector quantization (VQ) for clusteringdata. Different ...
متن کاملA novel local search method for microaggregation
In this paper, we propose an effective microaggregation algorithm to produce a more useful protected data for publishing. Microaggregation is mapped to a clustering problem with known minimum and maximum group size constraints. In this scheme, the goal is to cluster n records into groups of at least k and at most 2k_1 records, such that the sum of the within-group squ...
متن کاملRepeated Record Ordering for Constrained Size Clustering
One of the main techniques used in data mining is data clustering, which has many applications in computer science, biology, and social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into current clustering algorithms. One of the well researched constrained clustering algorithms is called microaggregation. In a microaggreg...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1711.11071 شماره
صفحات -
تاریخ انتشار 2017